Use machine learning to predict the length of flight delays (in minutes).
📁 Flight data provided by Zindi, split into a train set and a test set for model development
🕒 Delay duration in minutes
📉 Root Mean Square Error (RMSE)
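For reference, RMSE is the square root of the mean squared difference between actual and predicted delays, so it is expressed in minutes like the target. Below is a minimal sketch in plain Python; the function name `rmse` and the sample values are illustrative and not taken from the project code.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error between two equally long sequences of delays (minutes)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one observation is required")
    # Mean of the squared errors, then the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up delays, for illustration only.
perfect = rmse([0, 10, 20], [0, 10, 20])
off_by_some = rmse([0, 0], [3, 4])
```

Because the errors are squared before averaging, a few very long delays that the model misses weigh much more heavily than many small misses.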
We analysed how delays are distributed based on:
| Column | Description |
|---|---|
| ID | Unique flight identifier |
| DATOP | Date of flight |
| FLTID | Flight number |
| DEPSTN | Departure point |
| ARRSTN | Arrival point |
| STD | Scheduled time of departure |
| STA | Scheduled time of arrival |
| STATUS | Flight status |
| AC | Aircraft code |
| target | Flight delay (min) |
Removing Service Flights, i.e. flights where departure and arrival airports are the same …
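As a sketch of this cleaning step, assuming the data is loaded into a pandas DataFrame with the `DEPSTN` and `ARRSTN` columns from the table above, the service flights could be dropped like this. The helper name `drop_service_flights` and the sample rows (including the airport codes) are made up for illustration.

```python
import pandas as pd


def drop_service_flights(flights: pd.DataFrame) -> pd.DataFrame:
    """Return a copy without rows whose departure and arrival airport are identical."""
    same_airport = flights["DEPSTN"] == flights["ARRSTN"]
    return flights.loc[~same_airport].reset_index(drop=True)


# Made-up rows, for illustration only.
sample = pd.DataFrame(
    {
        "ID": ["f1", "f2", "f3"],
        "DEPSTN": ["AAA", "BBB", "AAA"],
        "ARRSTN": ["BBB", "BBB", "CCC"],
        "target": [12.0, 0.0, 45.0],
    }
)
cleaned = drop_service_flights(sample)
```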
A simple linear regression model using only the day of the week
(or aircraft code) as the predictor.
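One possible way to build such a baseline is sketched below. It assumes the weekday is derived from `DATOP`, that `DATOP` can be parsed by `pandas.to_datetime`, and that the weekday is one-hot encoded without an intercept; the project may have encoded it differently. With this encoding, the least-squares coefficients are simply the mean delay per weekday. The function names and sample rows are hypothetical.

```python
import numpy as np
import pandas as pd


def fit_weekday_baseline(flights: pd.DataFrame) -> np.ndarray:
    """Fit a least-squares model of delay on one-hot encoded weekday (0 = Monday)."""
    weekday = pd.to_datetime(flights["DATOP"]).dt.dayofweek.to_numpy()
    # One column per weekday, a single 1.0 per row.
    design = np.zeros((len(flights), 7))
    design[np.arange(len(flights)), weekday] = 1.0
    delays = flights["target"].to_numpy(dtype=float)
    coefficients, *_ = np.linalg.lstsq(design, delays, rcond=None)
    return coefficients


def predict_weekday_baseline(coefficients: np.ndarray, flights: pd.DataFrame) -> np.ndarray:
    """Predict the delay of each flight from the coefficient of its weekday."""
    weekday = pd.to_datetime(flights["DATOP"]).dt.dayofweek.to_numpy()
    return coefficients[weekday]


# Made-up rows, for illustration only (2024-01-01 is a Monday).
sample = pd.DataFrame(
    {
        "DATOP": ["2024-01-01", "2024-01-01", "2024-01-02"],
        "target": [10.0, 30.0, 5.0],
    }
)
coefficients = fit_weekday_baseline(sample)
predictions = predict_weekday_baseline(coefficients, sample)
```

A baseline of this kind is mainly useful as a reference point: any richer model should reach a lower RMSE than per-weekday averages.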
Many categorical variables …
… but there is:
A regression model using CatBoost with the following predictors: Flight Status; Aircraft Code; Departure and Arrival Point; Year, Month and Weekday of Departure Time
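The feature set above could be assembled roughly as follows. This is a sketch under assumptions, not the project's actual code: it takes "Departure Time" to mean the `STD` column, treats the four string columns as categorical features, and leaves CatBoost at its default hyperparameters. `build_features`, `fit_catboost`, the derived column names, and the sample rows are hypothetical. The `catboost` import sits inside `fit_catboost`, so the feature preparation runs without the package installed.

```python
import pandas as pd

# Columns passed to CatBoost as categorical features (assumption).
CATEGORICAL = ["STATUS", "AC", "DEPSTN", "ARRSTN"]


def build_features(flights: pd.DataFrame) -> pd.DataFrame:
    """Select the categorical predictors and derive year, month and weekday from STD."""
    std = pd.to_datetime(flights["STD"])
    features = flights[CATEGORICAL].astype(str)
    features["std_year"] = std.dt.year
    features["std_month"] = std.dt.month
    features["std_weekday"] = std.dt.dayofweek  # 0 = Monday
    return features


def fit_catboost(flights: pd.DataFrame):
    """Fit a CatBoost regressor on the prepared features (requires the catboost package)."""
    from catboost import CatBoostRegressor

    model = CatBoostRegressor(loss_function="RMSE", verbose=False)
    model.fit(build_features(flights), flights["target"], cat_features=CATEGORICAL)
    return model


# Made-up rows, for illustration only.
sample = pd.DataFrame(
    {
        "STATUS": ["S1", "S2"],
        "AC": ["AC1", "AC2"],
        "DEPSTN": ["AAA", "BBB"],
        "ARRSTN": ["BBB", "CCC"],
        "STD": ["2024-01-01 08:00:00", "2024-02-03 21:30:00"],
        "target": [15.0, 0.0],
    }
)
features = build_features(sample)
```

Passing the string columns through `cat_features` lets CatBoost handle the encoding itself, which avoids building large one-hot matrices for columns with many distinct values such as airports or aircraft codes.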
We're happy to answer your questions and look forward to your feedback.